Learning Gene Functional Classifications from Multiple Data Types

Authors

  • Paul Pavlidis
  • Jason Weston
  • Jinsong Cai
  • William Stafford Noble
Abstract

In our attempts to understand cellular function at the molecular level, we must be able to synthesize information from disparate types of genomic data. We consider the problem of inferring gene functional classifications from a heterogeneous data set consisting of DNA microarray expression measurements and phylogenetic profiles from whole-genome sequence comparisons. We demonstrate the application of the support vector machine (SVM) learning algorithm to this functional inference task. Our results suggest the importance of exploiting prior information about the heterogeneity of the data. In particular, we propose an SVM kernel function that is explicitly heterogeneous. In addition, we describe feature scaling methods for further exploiting prior knowledge of heterogeneity by giving each data type different weights.
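
The abstract does not spell out the form of the heterogeneous kernel, so the following is only a minimal sketch of one way such a kernel could be built: a separate polynomial kernel is computed on each data type (expression measurements and phylogenetic profiles) and the results are combined as a weighted sum. The function names, the polynomial degree, and the per-type weights `w_expr` and `w_phylo` are illustrative assumptions, not values or an API taken from the paper, and the weights here act on kernel values as a simplified stand-in for the feature scaling the abstract mentions.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]
# A gene is represented as (expression vector, phylogenetic profile).
Gene = Tuple[Vector, Vector]


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x * y for x, y in zip(a, b))


def poly_kernel(a: Vector, b: Vector, degree: int = 3) -> float:
    """Polynomial kernel (a.b + 1)^degree; the degree is an assumed choice."""
    return (dot(a, b) + 1.0) ** degree


def heterogeneous_kernel(
    gene_a: Gene,
    gene_b: Gene,
    w_expr: float = 1.0,   # assumed weight for the expression data
    w_phylo: float = 1.0,  # assumed weight for the phylogenetic profiles
    degree: int = 3,
) -> float:
    """Weighted sum of one kernel per data type.

    Features of different types never interact inside a single kernel,
    which is what makes the combination explicitly heterogeneous.
    Non-negative weights keep the sum a valid kernel.
    """
    if w_expr < 0 or w_phylo < 0:
        raise ValueError("weights must be non-negative")
    expr_a, phylo_a = gene_a
    expr_b, phylo_b = gene_b
    return (w_expr * poly_kernel(expr_a, expr_b, degree)
            + w_phylo * poly_kernel(phylo_a, phylo_b, degree))


def gram_matrix(genes: List[Gene], **kwargs: float) -> List[List[float]]:
    """Kernel matrix over all gene pairs, suitable for a precomputed-kernel SVM."""
    return [[heterogeneous_kernel(a, b, **kwargs) for b in genes] for a in genes]
```

A matrix produced by `gram_matrix` could then be handed to any SVM implementation that accepts a precomputed kernel. Raising one weight relative to the other makes the corresponding data type contribute more to the similarity between two genes.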




Journal:
  • Journal of computational biology : a journal of computational molecular cell biology

Volume 9, Issue 2

Publication date: 2002